Simpson’s Paradox in the interpretation of “leaky pipeline” data

Authors

  • Paul H. Walton
  • Daniel J. Walton
Abstract

Traditional ‘leaky pipeline’ plots are widely used to inform gender equality policy and practice. Herein, we demonstrate how a statistical phenomenon known as Simpson’s paradox can obscure trends in gender ‘leaky pipeline’ plots. Our approach uses Excel spreadsheets to generate hypothetical ‘leaky pipeline’ plots of gender inequality within an organisation. The principal factors that make up these hypothetical plots can be input into the model, so that a range of potential situations can be modelled, and we show how each factor is then reflected in the ‘leaky pipeline’ plots. We find that the effect of Simpson’s paradox on leaky pipeline plots can be simply and clearly illustrated with hypothetical modelling. Our study augments the findings of other statistical reports of Simpson’s paradox in clinical trial data and in gender inequality data. The findings in this paper, however, are presented in a way that makes the paradox accessible to a wide range of people.
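
The kind of hypothetical modelling described in the abstract can be sketched in a few lines of code. The example below is an illustration only: it uses Python rather than the Excel spreadsheets the authors mention, and the two departments, the two career stages, and every head count are invented assumptions, not data from the paper. It shows a case in which the share of women rises from the junior to the senior stage in each department, while the aggregated organisation-wide figures, which are what a ‘leaky pipeline’ plot displays, show a fall.

```python
# Hypothetical head counts as (women, men) per department and career stage.
# All names and numbers are invented for illustration; none come from the paper.
counts = {
    "Dept A": {"junior": (60, 40), "senior": (7, 3)},
    "Dept B": {"junior": (20, 80), "senior": (27, 63)},
}
STAGES = ("junior", "senior")


def share_of_women(women, men):
    """Fraction of staff at one stage who are women."""
    return women / (women + men)


def department_shares(counts):
    """Share of women at each stage, computed separately per department."""
    return {
        dept: {stage: share_of_women(*by_stage[stage]) for stage in STAGES}
        for dept, by_stage in counts.items()
    }


def aggregate_shares(counts):
    """Share of women at each stage after pooling all departments."""
    pooled = {}
    for stage in STAGES:
        women = sum(by_stage[stage][0] for by_stage in counts.values())
        men = sum(by_stage[stage][1] for by_stage in counts.values())
        pooled[stage] = share_of_women(women, men)
    return pooled


per_dept = department_shares(counts)
overall = aggregate_shares(counts)

for dept, shares in per_dept.items():
    print(f"{dept}: junior {shares['junior']:.0%}, senior {shares['senior']:.0%}")
print(f"Overall: junior {overall['junior']:.0%}, senior {overall['senior']:.0%}")
```

With these invented numbers, the share of women rises from 60% to 70% in Dept A and from 20% to 30% in Dept B, yet the pooled share falls from 40% to 34%. The reversal arises because most senior posts in this example sit in the department with the lower share of women, so the pooled senior figure is weighted towards it.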

Similar resources

Computational Social Scientist Beware: Simpson's Paradox in Behavioral Data

Observational data about human behavior is often heterogeneous, i.e., generated by subgroups within the population under study that vary in size and behavior. Heterogeneity predisposes analysis to Simpson’s paradox, whereby the trends observed in data that has been aggregated over the entire population may be substantially different from those of the underlying subgroups. I illustrate Simpson’s...

Integrating Bayesian Networks and Simpson’s Paradox in Data Mining

This paper proposes to integrate two very different kinds of methods for data mining, namely the construction of Bayesian networks from data and the detection of occurrences of Simpson’s paradox. The former aims at discovering potentially causal knowledge in the data, whilst the latter aims at detecting surprising patterns in the data. By integrating these two kinds of methods we can hopefully ...

Simpson’s Paradox – A Survey of Past, Present and Future Research

Simpson’s paradox refers to the reversal of a statistical relationship between two variables in sub-populations when the sub-populations are combined and analyzed as a population. This article is intended to provide a broad survey of the past, present and future research surrounding the issue. Real data from a discrimination litigation case is examined to identify the occurrence of the paradox....

How Likely is Simpson's Paradox in Path Models?

Simpson’s paradox is a phenomenon arising from multivariate statistical analyses that often leads to paradoxical conclusions, in the field of e-collaboration as well as many other fields where multivariate methods are employed. We derive a general inequality for the occurrence of Simpson’s paradox in path models with or without latent variables. The inequality is then used to estimate the proba...

How Likely is Simpson’s Paradox?

What proportion of all 2×2×2 contingency tables exhibit Simpson’s Paradox? An approximate answer is obtained for large sample sizes and extended to 2×2×l tables. Several conditional probabilities of the occurrence of Simpson’s Paradox are also derived. Given that the observed cell frequencies satisfy a Simpson reversal, the posterior probability that the population parameters satisfy the same...

Publication date: 2017